Clinical risk model

ABSTRACT

A model-assisted system and method for predicting health care services. In one implementation, a model-assisted system may comprise at least one processor programmed to access a database storing a medical record associated with a patient and analyze the medical record to identify a characteristic of the patient. The processor may determine a patient risk level indicating a likelihood that the patient will require a health care service within a predetermined time period; compare the patient risk level to a predetermined risk threshold; and generate a report indicating a recommended intervention for the patient. The processor may further determine a calibration factor indicating a difference between an average patient risk level and an average actual healthcare service usage for a first group of patients; and determine, based on the calibration factor, a bias relative to a second group of patients.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 62/990,933, filed on Mar. 17, 2020, and U.S. Provisional Application No. 63/106,539, filed on Oct. 28, 2020, the contents of which are incorporated herein by reference in their entirety.

BACKGROUND

Technical Field

The present disclosure relates to predicting medical care for patients and, more specifically, to a model-assisted system and method for predicting and planning for specific types of health care services for patients and/or types of clinical outcomes.

Background Information

In today's health care system, significant benefits may potentially be realized by reducing the likelihood that certain individuals or patients will need to make use of certain types of health care services (e.g., emergency or hospital room visits, ambulance transportation, emergency or acute treatments, hospital-based or other urgent care services, etc.). For example, certain types of health care services, such as hospital or other acute care services, may be more costly (and often significantly more costly), and also more desirable to avoid, than other types of health care services, for example, services delivered in locations outside of high cost hospital or urgent care centers, such as physician office visits, home care visits, etc. Thus, reducing patients' needs to make use of certain types of health care services may have the potential to significantly reduce health care costs for individuals, insurers, and group health insurance policy holders, among others, while also increasing the quality of care delivered to patients. For example, reducing the likelihood that one or more patients need to take advantage of hospitals or acute care may reduce the load on such services and may free up hospital-based and acute care resources for other patients. Additionally, reducing or eliminating patients' needs for hospital visits, etc., may reduce the likelihood that such patients are exposed to infectious diseases that may be contracted from other hospital patients. Such a benefit may be especially important to certain types of patients (e.g., cancer patients, etc.) with compromised immune systems. Note, the term hospital, as used herein, includes hospitals and other urgent care facilities.

One effective way to potentially reduce the likelihood for patient reliance upon certain types of health care services is to identify those patients with the highest likelihood for such future health care service usage and preemptively offer medical treatment, care, or other interventions (e.g., phone calls, sending prescriptions to pharmacies, electronic communications/reminders, referral to specialist, etc.) to those patients. For example, health care entities may deploy health care professionals to provide preemptive care to any number of patients ranging from a handful of patients (e.g., ten patients) to many patients (e.g., two thousand patients). When scheduling health care professionals and/or interventions to provide preemptive care (e.g., home visits, phone calls, sending prescriptions to pharmacies, treatment, electronic communications/reminders, etc.), health care entities may wish to prioritize patients most at risk for certain types of health care services in the near term to reduce the need for such health care services. For example, if a health care entity can identify that a patient is dehydrated or needs pain medication (among a variety of other conditions that may lead to an eventual visit to an emergency room or use of other hospital or acute care services), scheduling a health care professional, or other intervention, to care for the patient may help minimize or even avoid the need for such hospital or acute care services. Identifying patients that are at high risk for certain types of health care services or identifying patients at high risk for certain clinical outcomes (e.g., mortality, febrile neutropenia, depression), and scheduling health care professionals and/or other interventions to provide preemptive care to those patients in a proactive, preventive manner, may thus improve the care of patients and provide more targeted treatments to patients. Importantly, such patient identification and proactive interventions may significantly reduce the likelihood that such patients will make use of hospital or acute care, or suffer undesirable clinical outcomes, which may result in any or all of the benefits discussed above, among others.

Preemptively identifying those patients most likely to take advantage of hospital or other types of health care services, or patients who are at risk for certain clinical outcomes, however, can be challenging. The typical source of information regarding a patient and the health of that patient is the patient's medical chart, which may be maintained as an electronic health record (EHR). However, it is time consuming and inefficient (and in many cases impossible due to the sheer number of patients and the time involved to review each chart) to manually and continuously review the charts of each patient in order to identify and prioritize those patients most likely to access hospital-based or other types of health services (e.g., within a certain time period) or at risk for certain clinical outcomes. Moreover, in many cases, human reviewers may be incapable of recognizing patterns, markers, etc., in an EHR or other medical chart indicative of a patient's likely use of hospital or other health care services or a patient who is at risk for certain clinical outcomes within a certain time period (e.g., 1 week, 30 days, 60 days, 90 days, etc.). Accordingly, even if there were time to manually review (such as on an ongoing basis) the medical charts and health records for an entire patient population, such a review would likely be ineffective in identifying a suitable population of those patients most likely to make use of such health care services. Thus, there is a need for a technical system to more efficiently and effectively analyze patient records and identify patients most likely to make near term use of certain types of health care services or patients who are at near term risk for certain clinical outcomes.

SUMMARY

Embodiments consistent with the present disclosure include systems and methods for predicting health care services. In an embodiment, a model-assisted system may comprise at least one processor. The processor may be programmed to access a database storing medical records associated with a plurality of patients and analyze a medical record associated with a patient of the plurality of patients to identify a characteristic of the patient. The processor may determine, based on the patient characteristic and using a trained machine learning model, a patient risk level indicating a likelihood that the patient will require a health care service within a predetermined time period, the machine learning model being trained based on clinical factors weighted based on a logistic regression. The processor may further compare the patient risk level to a predetermined risk threshold; generate, based on the comparison, a report indicating a recommended intervention for the patient; determine a calibration factor indicating a difference between an average patient risk level and an average actual healthcare service usage for a first group of the plurality of patients; and determine, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients.

In another embodiment, a computer-implemented method for predicting health care services is disclosed. The method may comprise accessing a database storing medical records associated with a plurality of patients and analyzing a medical record associated with a patient of the plurality of patients to identify a characteristic associated with the patient. The method may comprise determining, based on the patient characteristic and using a trained machine learning model, a patient risk level indicating a likelihood that the patient will require a health care service within a predetermined time period, the machine learning model being trained based on clinical factors weighted based on logistic regression. The method may further comprise comparing the patient risk level to a predetermined risk threshold; generating, based on the comparison, a report indicating a recommended intervention for the patient; determining a calibration factor indicating a difference between an average patient risk level and an average actual healthcare service usage for a first group of the plurality of patients; and determining, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients.

In an embodiment, a system for evaluating bias in a machine learning model may comprise at least one processor. The processor may be programmed to receive a plurality of outputs from a machine learning model, the outputs comprising predictions for a plurality of patients based on medical records associated with the plurality of patients; access a plurality of actual outcomes associated with the plurality of patients; determine a calibration factor indicating a difference between the predictions and the actual outcomes for a first group of the plurality of patients; and detect, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients.
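By way of illustration only, the calibration factor and group comparison described above might be sketched as follows. The sketch assumes that actual healthcare service usage is recorded as 0 or 1 per patient, and that bias is flagged when the calibration factors of two groups differ by more than a tolerance. The function names, the tolerance value, and the gap-based comparison are hypothetical choices made for this example and are not specified by the disclosure.

```python
from statistics import mean


def calibration_factor(predicted_risks, actual_usage):
    """Average predicted risk minus average actual service usage for one group.

    predicted_risks: model outputs in [0, 1], one per patient.
    actual_usage: 1 if the patient used the service in the window, else 0.
    """
    if not predicted_risks or len(predicted_risks) != len(actual_usage):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(predicted_risks) - mean(actual_usage)


def detect_bias(first_group, second_group, tolerance=0.05):
    """Compare the calibration factors of two patient groups.

    Each group is a (predicted_risks, actual_usage) pair. The tolerance and
    the use of a simple difference are assumptions for illustration.
    Returns (is_biased, gap), where gap is first minus second.
    """
    gap = calibration_factor(*first_group) - calibration_factor(*second_group)
    return abs(gap) > tolerance, gap


# Example with made-up values: the model under-predicts for the first group.
first = ([0.2, 0.4, 0.6], [1, 1, 0])
second = ([0.5, 0.5], [1, 0])
biased, gap = detect_bias(first, second)
```

A negative gap in this sketch would indicate that the model under-predicts usage for the first group relative to the second group; a positive gap would indicate the opposite.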

Consistent with other disclosed embodiments, non-transitory computer-readable storage media may store program instructions, which are executed by at least one processing device and perform any of the methods described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute part of this specification, and together with the description, illustrate and serve to explain the principles of various exemplary embodiments. In the drawings:

FIG. 1 is a diagram illustrating an exemplary system environment for implementing embodiments consistent with the present disclosure.

FIG. 2 is a diagram illustrating an example process for predicting health care services consistent with the disclosed embodiments.

FIG. 3 illustrates an exemplary medical record for a patient consistent with the disclosed embodiments.

FIG. 4A illustrates an example machine learning system for implementing embodiments consistent with the present disclosure.

FIG. 4B illustrates example bias monitoring that may be performed using performance monitoring system 460 consistent with the disclosed embodiments.

FIG. 5 is an illustration of an example report identifying patients with heightened risk of a near-term medical visit consistent with the disclosed embodiments.

FIG. 6 is a flowchart showing an example process for predicting health care services, consistent with the disclosed embodiments.

FIG. 7 is a flowchart showing an example process for evaluating a machine learning model, consistent with the disclosed embodiments.

DETAILED DESCRIPTION

The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar parts. While several illustrative embodiments are described herein, modifications, adaptations and other implementations are possible. For example, substitutions, additions or modifications may be made to the components illustrated in the drawings, and the illustrative methods described herein may be modified by substituting, reordering, removing, or adding steps to the disclosed methods. Accordingly, the following detailed description is not limited to the disclosed embodiments and examples. Instead, the proper scope is defined by the appended claims.

Embodiments herein include computer-implemented methods, tangible non-transitory computer-readable mediums, and systems. The computer-implemented methods may be executed, for example, by at least one processor (e.g., a processing device) that receives instructions from a non-transitory computer-readable storage medium. Similarly, systems consistent with the present disclosure may include at least one processor (e.g., a processing device) and memory, and the memory may be a non-transitory computer-readable storage medium. As used herein, a non-transitory computer-readable storage medium refers to any type of physical memory on which information or data readable by at least one processor may be stored. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage medium. Singular terms, such as “memory” and “computer-readable storage medium,” may additionally refer to multiple structures, such as a plurality of memories and/or computer-readable storage mediums. As referred to herein, a “memory” may comprise any type of computer-readable storage medium unless otherwise specified. A computer-readable storage medium may store instructions for execution by at least one processor, including instructions for causing the processor to perform steps or stages consistent with an embodiment herein. Additionally, one or more computer-readable storage mediums may be utilized in implementing a computer-implemented method. The term “computer-readable storage medium” should be understood to include tangible items and exclude carrier waves and transient signals. In addition, as referred to herein, the terms “health service,” “health care service” and “medical service” are used interchangeably.

It is understood that embodiments of the present disclosure may be used for the purpose of supporting or providing recommendations to healthcare professionals about prevention, diagnosis, or treatment of a disease or condition. Further, it is understood that embodiments of the present disclosure may enable such healthcare professionals to independently review the basis for such recommendations presented by the present disclosure, so that such healthcare professionals are primarily relying on their independent review to make a clinical diagnosis or treatment decision regarding an individual patient, and using the recommendations as supplemental information.

Embodiments of the present disclosure provide systems and methods for predicting near term use of certain types of health care services for patients (e.g., hospital or acute care) or predicting certain clinical outcomes. A user of the disclosed systems and methods may encompass any individual who may wish to access and/or analyze patient data. Thus, throughout this disclosure, references to a “user” of the disclosed systems and methods may encompass any individual, such as a physician, a healthcare administrator, a researcher, an insurance adjuster, a quality assurance department at a health care institution, and/or any other entity associated with a patient.

FIG. 1 illustrates an exemplary system environment 100 for implementing embodiments consistent with the present disclosure, described in detail below. As shown in FIG. 1, system environment 100 may include several components, including client devices 110, data sources 120, system 130, and/or network 140. It will be appreciated from this disclosure that the number and arrangement of these components is exemplary and provided for purposes of illustration. Other arrangements and numbers of components may be used without departing from the teachings and embodiments of the present disclosure.

As shown in FIG. 1, exemplary system environment 100 may include a system 130. System 130 may include one or more server systems, databases, and/or computing systems configured to receive information from entities over a network, process the information, store the information, and display/transmit the information to other entities over the network. Thus, in some embodiments, the network may facilitate cloud sharing, storage, and/or computing. In one embodiment, system 130 may include a processing engine 131 and one or more databases 132, which are illustrated in a region bounded by a dashed line representing system 130. Processing engine 131 may comprise at least one processing device, such as one or more generic processors, e.g., a central processing unit (CPU), a graphics processing unit (GPU), or the like and/or one or more specialized processors, e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like.

The various components of system environment 100 may include an assembly of hardware, software, and/or firmware, including a memory, a central processing unit (CPU), and/or a user interface. Memory may include any type of RAM or ROM embodied in a physical storage medium, such as magnetic storage including floppy disk, hard disk, or magnetic tape; semiconductor storage such as solid-state disk (SSD) or flash memory; optical disc storage; or magneto-optical disc storage. A CPU may include one or more processors for processing data according to a set of programmable instructions or software stored in the memory. The functions of each processor may be provided by a single dedicated processor or by a plurality of processors. Moreover, processors may include, without limitation, digital signal processor (DSP) hardware, or any other hardware capable of executing software. An optional user interface may include any type or combination of input/output devices, such as a display monitor, keyboard, and/or mouse.

Data transmitted and/or exchanged within system environment 100 may occur over a data interface. As used herein, a data interface may include any boundary across which two or more components of system environment 100 exchange data. For example, environment 100 may exchange data between software, hardware, databases, devices, humans, or any combination of the foregoing. Furthermore, it will be appreciated that any suitable configuration of software, processors, data storage devices, and networks may be selected to implement the components of system environment 100 and features of related embodiments.

The components of environment 100 (including system 130, client devices 110, and data sources 120) may communicate with each other or with other components through a network 140. Network 140 may comprise various types of networks, such as the Internet, a wired Wide Area Network (WAN), a wired Local Area Network (LAN), a wireless WAN (e.g., WiMAX), a wireless LAN (e.g., IEEE 802.11, etc.), a mesh network, a mobile/cellular network, an enterprise or private data network, a storage area network, a virtual private network using a public network, a nearfield communications technique (e.g., Bluetooth, infrared, etc.), or various other types of network communications. In some embodiments, the communications may take place across two or more of these forms of networks and protocols.

System 130 may be configured to receive and store the data transmitted over network 140 from various data sources, including data sources 120, process the received data, and transmit data and results based on the processing to client device 110. For example, system 130 may be configured to receive structured and/or unstructured data from one or more data sources 120 or other sources in network 140. In some embodiments, the data may include medical information stored in the form of one or more medical records. Each medical record may be associated with a particular patient. Data sources 120 may be associated with a variety of sources of medical information for a patient. For example, data sources 120 may include medical care providers of the patient, such as physicians, nurses, specialists, consultants, hospitals, clinics, and the like. Data sources 120 may also be associated with laboratories such as radiology or other imaging labs, hematology labs, pathology labs, etc. Data sources 120 may also be associated with insurance companies or any other sources of patient data (e.g., patient reported outcomes, wearable devices that track health information, public health datasets or registries).

System 130 may further communicate with one or more client devices 110 over network 140. For example, system 130 may provide results based on analysis of information from data sources 120 to client device 110. Client device 110 may include any entity or device capable of receiving or transmitting data over network 140. For example, client device 110 may include a computing device, such as a server or a desktop or laptop computer. Client device 110 may also include other devices, such as a mobile device, a tablet, a wearable device (e.g., smart watches, implantable devices, fitness trackers, etc.), a virtual machine, an IoT device, or other various technologies. In some embodiments, system 130 may further receive input or queries from client device 110. For example, client device 110 may transmit queries for information about one or more patients over network 140 to system 130, such as a query for patients likely to require near-term medical services (e.g., emergency medical services) within a particular time period, or various other information about a patient.

In some embodiments, system 130 may be configured to analyze medical records (or other forms of structured or unstructured data) of a patient or patient population to determine a risk level, a relative risk level, or any other suitable indicator of a likelihood that one or more patients will make near term use of certain types of health care services (e.g., hospital or acute care). For example, system 130 may analyze medical records of a patient to determine whether the patient will make use of specified health care services within a specific time window (e.g., the next 60 days). System 130 may be configured to use one or more machine learning models to identify these probabilities. Such systems and methods may provide value for health care entities, individuals, and others, because at risk patients may be preemptively identified and treated before escalation of one or more conditions, the occurrence of one or more medical events, or other events that may lead to use of hospital or acute health care services. Such a system, therefore, may decrease the need for hospital or acute care and/or lower the total cost of patient care, among many other potential benefits. In another example, system 130 may prioritize patients based on the type of health care service likely to be needed, where patients expected to require more serious health care services or more costly health care services are prioritized before patients expected to require less serious or less costly health care services.

System 130 may automatically analyze patient records and triage patients according to (i) a likelihood of near-term use of certain types of medical services (e.g., hospital or acute care); (ii) a likelihood of a particular near-term clinical outcome; or (iii) based on any other criteria. The system may automatically generate this information and present it for use via one or more reports, graphical user interfaces, mobile device interfaces, etc. With this information, preemptive care for patients may be planned, according to, for example: those patients most likely to make near-term use of hospital or acute medical services, those patients who are most likely to suffer particular near-term clinical outcomes, those patients for whom treatment is most likely to result in a reduced likelihood of near-term use of hospital or acute medical services (e.g., reduction in likelihood by at least a predetermined threshold amount, etc.), or any other group of patients based on a predicted impact to one or more health care services or a predicted occurrence of an undesirable clinical outcome. In some cases, system 130 may prioritize patients based on a predicted time that the patients are expected to make use of a specified health care service. For example, patients predicted to make use of hospital or acute health care service within 1-3 days may be prioritized ahead of patients predicted to make use of such services within 1-2 weeks, 1-2 months, etc.
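As one non-limiting illustration, the time-based prioritization described above might be sketched as a sort over per-patient predictions. The record layout, the field names (`risk`, `days_until_use`), and the tie-breaking rule (higher risk first among patients with the same predicted timing) are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    patient_id: str
    risk: float          # predicted likelihood of service use, in [0, 1]
    days_until_use: int  # predicted days until the service is used (hypothetical field)


def prioritize(predictions):
    """Order patients so that the soonest predicted use comes first.

    Among patients with the same predicted timing, the higher predicted
    risk comes first. Both rules are assumptions for illustration.
    """
    return sorted(predictions, key=lambda p: (p.days_until_use, -p.risk))


# Example with made-up values.
queue = prioritize([
    Prediction("A", 0.90, 30),
    Prediction("B", 0.40, 2),
    Prediction("C", 0.70, 2),
])
ordered_ids = [p.patient_id for p in queue]
```

In this sketch, patients B and C, both predicted to use services within 2 days, would be placed ahead of patient A despite patient A's higher risk, consistent with prioritizing by predicted time.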

FIG. 2 is a block diagram illustrating an example process 200 for predicting health care services or clinical outcomes consistent with the disclosed embodiments. Process 200 may include receiving medical information, such as medical record 212 from a provider system 210. Provider system 210 may include any system that may collect and/or store medical data for patients. For example, provider system 210 may include a medical care office, a hospital, a surgical center, a clinic, a lab or testing facility, a hospice center, an imaging or radiology facility, an addiction treatment facility, an urgent care facility, a rehabilitation center, a telehealth platform, an insurance facility, or any other system that may provide access to medical data. Provider system 210 may correspond to or may be associated with client devices 110 and/or data sources 120 described above.

As shown in FIG. 2, medical record 212 may be received from provider system 210. Medical record 212 may include any structured or unstructured data associated with a patient. In some embodiments, multiple medical records 212 may be received. Each patient may be represented by one or more records generated by one or more health care professionals or by the patient. For example, a doctor associated with the patient, a nurse associated with the patient, a physical therapist associated with the patient, or the like, may each generate a medical record for the patient. In some embodiments, one or more records may be collated and/or stored in the same database. In other embodiments, one or more records may be distributed across a plurality of databases. In some embodiments, the records may be stored and/or provided in a plurality of electronic data representations. For example, the patient records may be represented as one or more electronic files, such as text files, portable document format (PDF) files, extensible markup language (XML) files, or the like. If the documents are stored as PDF files, images, or other files without text, the electronic data representations may also include text associated with the documents derived from an optical character recognition process. Additional details regarding medical record 212 are provided below with respect to FIG. 3.

In some embodiments, system 130 may be configured to transform this source data into a format that is interpretable by trained model 220. For example, these transformations include, but are not limited to, mapping source data to standardized formats and extracting clinical information from unstructured data using machine learning techniques. In some embodiments, medical record 212 may be associated with a request from provider system 210. For example, provider system 210 may query system 130 for near-term risk predictions for a patient or group of patients (e.g., which patients are likely to require hospital or acute care or which patients are likely to experience a particular clinical outcome). In some embodiments, medical record 212 may be provided periodically, in response to a request from system 130, any time a particular patient has an encounter with a health system (e.g., when the medical record is updated), or based on various other forms of triggers. In some embodiments, the triggers may be configurable by a user (e.g., by a system administrator, a healthcare provider system, etc.).

Medical record 212 may be input into a trained model 220, which may be configured to generate predictions for patients. Trained model 220 may be included in or otherwise associated with system 130, as described above. Trained model 220 may include any trained machine learning model configured to generate risk predictions 222 based on input data. In some embodiments, trained model 220 may include an artificial neural network. Various other machine learning algorithms may be used, including a logistic regression, a linear regression, a regression, a random forest, a K-Nearest Neighbor (KNN) model (for example as described above), a K-Means model, a decision tree, a Cox proportional hazards regression model, a Naïve Bayes model, a Support Vector Machines (SVM) model, a gradient boosting algorithm, or any other form of machine learning model or algorithm. Additional details regarding the training and implementation of trained model 220 are described below with respect to FIG. 4A. Risk predictions 222 may include any information indicating a predicted near-term medical service or clinical outcome for a patient or a group of patients.
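For illustration, where trained model 220 is a logistic regression, a risk prediction might be computed as the logistic function of a weighted sum of clinical factors. The following is a minimal sketch under that assumption; the feature names, weights, and intercept shown are invented placeholders and do not represent trained values or the factors actually used by any embodiment.

```python
import math


def risk_level(features, weights, intercept):
    """Return a probability in (0, 1) from weighted clinical factors.

    features: mapping of factor name to numeric value for one patient.
    weights: mapping of factor name to coefficient; factors without a
             coefficient contribute nothing (an assumption of this sketch).
    """
    z = intercept + sum(weights.get(name, 0.0) * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical factors and coefficients, for illustration only.
example_weights = {"recent_er_visits": 0.8, "age_over_65": 0.5}
example_patient = {"recent_er_visits": 2, "age_over_65": 1}
example_risk = risk_level(example_patient, example_weights, intercept=-3.0)
```

The resulting value may then be compared to a predetermined risk threshold, as described elsewhere in this disclosure.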

Based on the results of trained model 220, an output 224, which may indicate the risk predictions 222, may be generated and provided to provider system 210. In some embodiments, output 224 may be provided to one or more computing devices, such as client devices 110, in a physician's office, home care service, or the like, for presentation on a display associated with the one or more computing devices. For example, the reports may be displayed on one or more mobile devices for use by medical professionals. In some cases, the reports may be part of a cooperative medical care scheduling system enabling auto-scheduling and tracking of care to patients according to a predicted likelihood of near-term medical services. Accordingly, output 224 may also include scheduling preemptive care based on risk predictions 222. As noted above, a physician or other healthcare professional may review and/or accept proposed scheduling determined by the scheduling system.

Output 224 may be generated in any suitable form. In some cases, one or more reports may be generated including a list of patients to receive preemptive treatment, a list of patients organized by predicted risk for use of hospital, acute or other specified health service, a list organized according to predicted risk of using such services and a predicted time or time range when such services are expected to be pursued, or according to any other or additional criteria. In further cases, one or more reports may be generated including a list of patients who are at risk for certain clinical outcomes, a list of patients organized by predicted risk for occurrence of such outcomes, a list organized according to predicted risk of the occurrence of such outcomes and a predicted time or time range when such outcomes are likely to occur, or according to any other or additional criteria. The reports may be provided in paper form or displayed as part of a user interface. For example, the reports may identify one or more patients who have a predetermined risk level (e.g., having a risk higher than a predetermined threshold). Such reports may be available at point of care locations, in electronic health record systems, etc. Such reports may be generated according to any desired periodicity (e.g., daily, weekly, in near real time, etc.). An example report that may be generated is shown in FIG. 5 and discussed below in further detail.
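As a simple, non-limiting illustration of one such report, the sketch below selects patients whose predicted risk exceeds a predetermined threshold and lists them in descending order of risk. The function names, the text layout, and the sample values are hypothetical and are not intended to reproduce the report of FIG. 5.

```python
def build_report(risk_by_patient, threshold):
    """Return (patient_id, risk) rows for patients above the threshold.

    Rows are ordered from highest to lowest predicted risk.
    """
    rows = [(pid, risk) for pid, risk in risk_by_patient.items()
            if risk > threshold]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def format_report(rows):
    """Render the rows as plain text suitable for display or printing."""
    lines = ["Patient    Risk"]
    lines += [f"{pid:<10} {risk:.2f}" for pid, risk in rows]
    return "\n".join(lines)


# Example with made-up values and a hypothetical threshold of 0.5.
rows = build_report({"p1": 0.20, "p2": 0.80, "p3": 0.55}, threshold=0.5)
text = format_report(rows)
```

The same rows could instead be grouped by predicted time range or by type of clinical outcome, depending on the criteria selected for the report.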

Optionally, process 200 may further include monitoring and tracking a performance of trained model 220, as shown by performance modeling 230. Performance modeling 230 may monitor one or more inputs or outputs of trained model 220, including medical record 212 and output 224. These inputs and outputs may be analyzed to assess a performance of trained model 220. In some embodiments, performance modeling 230 may access additional data, such as historical data, actual results data, or other data to assess the performance of trained model 220. For example, performance modeling 230 may be configured to identify potential biases introduced into trained model 220, as described further below with respect to FIG. 4B. In some embodiments, trained model 220 may be tuned or adjusted based on the results of performance modeling 230. Various other actions may be taken based on the results of performance modeling 230, such as generating an alert, flagging output 224, or the like. Additional details regarding performance monitoring techniques are described below with respect to FIGS. 4A and 4B.

FIG. 3 illustrates an exemplary medical record 300 for a patient consistent with the disclosed embodiments. Medical record 300 may be received from provider system 210 and processed by trained model 220 to identify whether a patient associated with medical record 300 is predicted to use near-term medical services or suffer a near-term clinical outcome. Accordingly, medical record 300 may correspond to medical record 212, described above. The records received from provider system 210 may include structured data 310 and/or unstructured data 320, as shown in FIG. 3.

Structured data 310 may include quantifiable or classifiable data about the patient, such as gender, age, race, weight, vital signs, lab results, date of diagnosis, diagnosis type, disease staging (e.g., billing codes), therapy timing, procedures performed, visit date, practice type, insurance carrier and start date, medication orders, medication administrations, or any other measurable data about the patient. Unstructured data 320 may include information about the patient that is not quantifiable or easily classified, such as physician's notes or the patient's lab reports. Unstructured data 320 may include information such as a physician's description of a treatment plan, notes describing what happened at a visit, statements or accounts from a patient, subjective evaluations or descriptions of a patient's well-being, radiology reports, pathology reports, laboratory reports, etc. Structured data 310 and/or unstructured data 320 may be processed and input into trained model 220. In some embodiments, the unstructured data may be captured by an abstraction process, while the structured data may be entered by the health care professional or calculated using algorithms.

FIG. 4A illustrates an example machine learning system 400 for implementing embodiments consistent with the present disclosure. Machine learning system 400 may be implemented as part of system 130 (as shown in FIG. 1). For example, machine learning system 400 may be a component of or a process performed using processing engine 131. In accordance with the disclosed embodiments, machine learning system 400 may develop and use a trained model to predict use of near-term patient medical services or a particular clinical outcome. For example, as shown in FIG. 4A, machine learning system 400 may construct a trained model 430 for identifying patients associated with a risk of a near-term hospital service, acute service or other type of health service. Machine learning system 400 may develop model 430 through a training process, for example, using training algorithm 420.

Training of model 430 may involve the use of a training data set 410, which may be input into training algorithm 420 to develop the model. Training data 410 may include a plurality of patient medical records 412, which may include hospital services, acute health services or other type of health care services provided to patients associated with patient medical records 412. As an illustrative example, each of medical records 412 may be associated with an effective date (e.g., representing a simulated date the medical record may be accessed) and healthcare visit data 414 may indicate dates subsequent to the effective date that the patient required the hospital service, acute service or other type of health service. Accordingly, model 430 may be trained to associate various feature data within medical records 412 to subsequent hospital services, acute services, or other types of health services represented in healthcare visit data 414. In addition to predicting health care services likely to be provided for a patient based on severity or urgency for the patient or the type of facility or service provided, trained model 430 may be used to predict a wide variety of patient visits, treatment types, patient clinical outcomes or other purposes. Accordingly, the types of services the system is trained to predict may vary. In some embodiments, a physician or other healthcare provider may specify the types of services of interest. For example, a physician may be presented with a list of visit types (e.g., hospital visits, urgent care visits, etc.) or clinical outcomes, and the system may be configured to determine patient risk levels associated with the selected services or outcomes.
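As an illustration only, the pairing of an effective date with subsequent healthcare visit data might be turned into a binary training label as sketched below. The function name label_record, the 60-day default window, and the sample dates are assumptions introduced for this sketch and do not appear in the disclosure.

```python
from datetime import date, timedelta


def label_record(effective_date, visit_dates, window_days=60):
    """Return 1 if any visit falls after the effective date and within
    the window, else 0. The window length is an illustrative assumption."""
    window_end = effective_date + timedelta(days=window_days)
    return int(any(effective_date < d <= window_end for d in visit_dates))


# Hypothetical record: effective date Jan. 1, 2020, one later hospital visit.
example_label = label_record(date(2020, 1, 1), [date(2020, 2, 15)])
```

Visits on or before the effective date are ignored in this sketch, since the model is meant to associate features with services used after the simulated access date.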

In some embodiments, training data 410 may also be cleaned, conditioned, and/or manipulated prior to input into training algorithm 420 to facilitate the training process. Machine learning system 400 may extract one or more features (or feature vectors) from the records and apply training algorithm 420 to determine correlations between the features and the subsequent medical visits. These features may be extracted from structured and/or unstructured data as described above with respect to FIG. 2. In some embodiments, the features may include demographic features, such as a patient's gender, race/ethnicity, sexual orientation, age, social indicators (e.g., income level, education level, food insecurity, access to housing and utility services, proximity to facility, access to caregiver) or other demographic information that may be included in a patient's medical data. In some embodiments, the features may include clinical data, such as indications of a patient's diagnosis, cancer stage (e.g., stage 4 breast cancer), comorbidities (e.g., heart disease), indications of adverse events (e.g., sepsis), lab tests and/or results (e.g., out of range albumin), vitals (e.g., weight, height, blood pressure, etc.), visits (e.g., visit to pain specialist), orders (e.g., antiemetic, pain agent), prescriptions, treatments, administrations (e.g., antineoplastic with high emetogenic potential), performance status (e.g., ECOG performance status), reported side effects, or any other clinical information that may be represented in structured or unstructured medical data. Trained model 430 may be trained to correlate such factors with actual health care usage or actual patient outcomes. In some cases, training of the model may result in model weights being assigned according to observed correlations between any of the model parameter inputs and health care usage (e.g., whether such usage occurs, when such usage occurs, a likelihood that such usage occurs, a probability that such usage occurs, etc.). In some cases, training of the model may result in model weights being assigned according to observed correlations between any of the model parameter inputs and occurrence of a particular clinical outcome (e.g., whether such outcome occurs, when such outcome occurs, a likelihood that such outcome occurs, a probability that such outcome occurs, etc.).

In some embodiments, the features used as inputs to training algorithm 420 may be weighted based on an expected degree of relevance as to whether a patient will use a particular near-term medical service or experience a near-term clinical outcome. For example, features such as particular diagnoses or previous use of a particular medical service or use of any medical services generally may be identified as having a higher expected relevance to predicted future use of medical services than others. Similarly, features such as particular diagnoses or previous use of a particular medical service or use of any medical services generally may be identified as having a higher expected relevance to predicted experience of a particular clinical outcome than others. Accordingly, training algorithm 420 may receive weights associated with one or more features as an input when training trained model 430. These weights may be determined in various ways. In some embodiments, medical care providers, such as physicians, nurses, researchers, insurance specialists, or other practitioners may be consulted to determine the weights. For example, the weights may be based on a survey, a poll, a focus group, an interview, a publication, or other form of input from medical providers.

In some embodiments, the weights may be defined by a logistic regression. The magnitude of a feature's coefficient in the logistic regression may define its importance. For example, a greater magnitude may indicate a greater importance. In the context of a logistic regression, where all predictors are binary, the presence (i.e., a value of 1) of a predictor (e.g., presence of an abnormal lab) indicates an increase in the log odds of the outcome by the value of the coefficient. Therefore, if the coefficient is positive, this may represent an increased likelihood of observing a positive outcome for that observation, and, if negative, a decreased likelihood of observing that outcome, all other predictors being constant. Thus, the most positive values of coefficients are associated with the highest increases in probability of the outcome being positive (e.g., actually having an ER visit in 60 days). A predetermined number of predictors with the most positive coefficients (e.g., 10, 20, 30, . . . N predictors) may thus be treated as top features associated with risk for the outcome and may be surfaced in the output. Various other methods, such as gradient boosted trees, a Shapley value, a Gini impurity, or other approaches may be used.
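By way of a hedged example, selecting the predictors with the most positive coefficients and converting log odds to a probability for binary predictors might look like the following sketch. The feature names, coefficient values, intercept, and function names are hypothetical and chosen only for illustration.

```python
import math


def top_risk_features(coefficients, n=10):
    """Return up to n (name, coefficient) pairs with the most positive
    coefficients, largest first."""
    positive = [(name, c) for name, c in coefficients.items() if c > 0]
    positive.sort(key=lambda item: item[1], reverse=True)
    return positive[:n]


def predicted_probability(intercept, coefficients, present):
    """Logistic model with binary predictors: each present predictor adds
    its coefficient to the log odds of the outcome."""
    log_odds = intercept + sum(coefficients[name] for name in present)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients for three binary predictors.
coefs = {"abnormal_albumin": 0.9, "prior_er_visit": 1.4, "recent_checkup": -0.3}
top = top_risk_features(coefs, n=2)
```

In this sketch a predictor with a negative coefficient is never surfaced as a top risk feature, which matches the description of ranking by the most positive values.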

Once model 430 is constructed, input data, such as medical records 432, may be input to model 430. Medical records 432 may correspond to medical record 212 and/or 300, as described above. For example, medical records 432 may include structured and unstructured data associated with a plurality of patients, such that each patient is associated with one or more medical records. Trained model 430 may extract features (which may include, but are not limited to, those described above with respect to medical records 412) from medical records 432 to generate an output 450. In some embodiments, medical records 432 may be processed prior to input into trained model 430. This may include extracting features from unstructured and/or structured data, image analysis (e.g., optical character recognition (OCR)), natural language processing tools, or various other methods. In some embodiments, input to trained model 220 may include machine learning outputs from other systems. For example, machine learning output from other systems can include a metastatic Natural Language Processing Model that is trained to predict a patient's risk of having a metastatic disease. The predicted probability of a patient having a particular metastatic disease can be provided as input to trained model 220.

Output 450 may include risk predictions 452, which may correspond to risk prediction 222 described above. Risk predictions 452 may identify patients associated with medical records 432 that are expected to require or use certain types of medical services (e.g., hospital-based or acute care) within a predetermined time period or identify patients who are likely to experience certain clinical outcomes within a predetermined time period. These risk predictions may be presented in various ways. In some embodiments, a particular patient's risk may be binary. For example, a patient may be designated as “high risk” or “low risk” (which may be represented as a 1 or 0, or in various other forms). In some embodiments, the risk may be indicated as a probability such as, for example, between 0 and 1 (e.g., patient has a probability of 0.67 for receiving hospital-based care within 60 days). In addition to (or as an alternative to) a likelihood a patient will receive near-term medical care, risk predictions 452 may include other predictions for a patient or group of patients. For example, this may include predicted inpatient admissions, ICU admissions, patient mortality, an adverse event (e.g., sepsis, febrile neutropenia, etc.), and/or a combination of the above. For example, risk predictions 452 may include an “acute care outcome” if a patient is predicted to have a hospital visit, an inpatient admission, or an ICU admission within a predetermined number of days.

In some embodiments, output 450 may include one or more reports, as described above. For example, the report may comprise a list of patients expected to use emergency or other medical services within a predetermined time period (e.g., patients having been designated as “high risk,” etc.). In some embodiments, the report may include both “high risk” and “low risk” patients. For example, the report may include patients belonging to a particular group. This may include patients having a particular medical condition, patients of a particular medical provider, patients of particular demographics, etc. In some embodiments, the patients may be identified by provider system 210. For example, the patients may be identified as part of a particular query. In some embodiments, the report may only include patients exceeding a particular likelihood threshold (e.g., 50%, 60%, 70%, 80%, 90%, 99%, etc.) or confidence value threshold. In some embodiments, the threshold may be adjustable based on desired levels of efficiency and performance. For example, system 130 may be configured to receive user inputs to tune performance of the model, which may include adjusting the threshold likelihood for inclusion in the report.
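One possible way to apply an adjustable likelihood threshold when assembling a report is sketched below. The function name build_report, the patient identifiers, the probabilities, and the 0.7 default are assumptions made for illustration.

```python
def build_report(predictions, threshold=0.7):
    """Keep patients whose predicted likelihood exceeds the threshold and
    list them from highest to lowest likelihood.

    predictions maps a patient identifier to a probability in [0, 1].
    """
    selected = [(pid, p) for pid, p in predictions.items() if p > threshold]
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Hypothetical risk predictions for three patients.
predictions = {"Patient 1": 0.91, "Patient 2": 0.42, "Patient 3": 0.75}
report = build_report(predictions, threshold=0.7)
```

Raising the threshold shortens the report, while lowering it includes more patients, which is the trade-off a user input for tuning would control in this sketch.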

In some embodiments, system 130 may be configured to generate one or more preemptive care recommendations 454 as part of output 450. Preemptive care recommendations 454 may be a direct output of trained model 430 or may be generated as a subsequent step based on risk predictions 452. Preemptive care recommendations (also referred to herein as “recommended interventions”) may include any temporal-based recommendations for treatment or care of a patient. For example, the preemptive care recommendations may include a recommended appointment with a particular care provider, an in-home patient visit, enrolling the patient in a particular treatment plan or facility, prescribing or refilling a prescription for a patient, providing a treatment to a patient, calling to check in with the patient, or any other event associated with care of the patient that may be scheduled. In some embodiments, the preemptive care recommendation may be expected to negate or reduce the risk of an undesirable near-term medical service for the patient (e.g., hospital-based care) or to negate or reduce the risk of an undesirable near-term clinical outcome. For example, trained model 430 may determine that certain features in medical records 432 indicate that a patient is expected to use hospital-based care within the next 90 days. Accordingly, system 130 may schedule a preventative treatment, a check-up or other preemptive care that may reduce the risk of the undesirable near-term medical service or clinical outcome.

In some embodiments, the preemptive care recommendation may be represented as general recommendations for a patient visit (e.g., flagging the patient for a near-term visit, etc.). In some embodiments, the preemptive care recommendation may include a specific date or date range in which a patient visit or other preemptive care is recommended. For example, the preemptive care recommendation may be to schedule a patient visit or preventative treatment at a particular date and/or time (e.g., next Tuesday at 9:00 AM). In some embodiments, preemptive care recommendations may be developed for multiple patients. For example, system 130 may develop a schedule for visiting each of the patients in the upcoming weeks. Accordingly, system 130 may be configured to generate scheduling recommendations such that each patient is seen at a different time to avoid conflicts. In some embodiments, the scheduling recommendations may be for a particular practitioner or caregiver. For example, if a clinic has three caregivers that provide in-home visits, scheduling recommendations may allocate patients included in the report to one of the three caregivers and may generate separate schedules for each of the three caregivers. This may include optimizing the schedule based on various factors. For example, the schedule may be developed so that a particular caregiver sees patients located within a predetermined distance of each other in the same day, or various other optimizations. In some embodiments, system 130 may access a current schedule of a healthcare provider and/or patient, and may generate the scheduling requirements based on the current schedule (e.g., to avoid conflicts, to reschedule other events, or the like).
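As a simplified illustration of allocating reported patients among several caregivers, a round-robin assignment could be sketched as follows. This is an assumed stand-in, not the disclosed scheduling method: it ignores distance, existing calendars, and other optimization factors, and the names are hypothetical.

```python
def allocate_visits(patients, caregivers):
    """Assign patients to caregivers in rotation and return one ordered
    visit list per caregiver."""
    schedules = {caregiver: [] for caregiver in caregivers}
    for i, patient in enumerate(patients):
        caregiver = caregivers[i % len(caregivers)]
        schedules[caregiver].append(patient)
    return schedules


# Hypothetical clinic with three in-home caregivers and four reported patients.
schedules = allocate_visits(["P1", "P2", "P3", "P4"], ["A", "B", "C"])
```

Because each patient appears in exactly one caregiver's list, no patient is double-booked in this sketch; a fuller implementation would also order each list by time slot and location.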

In some embodiments, the report and/or the preemptive care recommendations (e.g., scheduling a patient visit) may be developed based on a priority level for each patient. The priority level may be any information indicating a relative priority among the patients. The priority may be based on various factors. For example, patients more likely to use hospital-based or other type of medical services within a particular time period may be given higher priority. In some embodiments, trained model 430 may also output an expected date when a patient is expected to use near-term medical services. Accordingly, the priority may be based on how soon the patient is expected to require or use a medical service. The priority may be based on other factors, such as the patient's medical condition, a severity of the condition, an expected clinical outcome, an urgency of a procedure expected to be required by the patient, an urgency of an expected emergency or other service, the patient's age, or various other factors. The report may be sorted or filtered based on the priority level of the patients. In some embodiments, the preemptive care recommendations may also be based on the priority level. For example, an appointment schedule may automatically be generated to schedule visits with the highest priority patients before visits to patients with lower priorities.
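A minimal sketch of ordering patients by priority is given below, assuming priority is derived from predicted risk first and, for equal risk, from how soon the service is expected. The field names risk and expected_days and the sample values are assumptions for this example.

```python
def prioritize(patients):
    """Sort patients so that higher risk comes first; ties are broken by
    the smaller number of days until the expected service."""
    return sorted(patients, key=lambda p: (-p["risk"], p["expected_days"]))


# Hypothetical patients with a predicted risk and an expected time to service.
patients = [
    {"id": "Patient 1", "risk": 0.80, "expected_days": 30},
    {"id": "Patient 2", "risk": 0.95, "expected_days": 45},
    {"id": "Patient 3", "risk": 0.80, "expected_days": 10},
]
ordered_ids = [p["id"] for p in prioritize(patients)]
```

Other factors named above, such as severity of the condition or patient age, could be added as further elements of the sort key.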

In some embodiments, machine learning system 400 may include a performance monitoring component 460, which may correspond to performance monitoring 230 described above with respect to FIG. 2. Performance monitoring component 460 may be configured to monitor inputs and/or outputs of model 430 to assess the performance of model 430. Performance monitoring can comprise, for example, data quality monitoring, performance monitoring of the predictions and bias monitoring of the predictions. In some embodiments, performance monitoring component 460 may be configured to detect anomalous inputs to model 430. For example, while model 430 may be trained such that it confidently predicts likelihoods of the use of near-term medical services or near-term clinical outcomes, anomalous input values may affect the results. Accordingly, as shown in FIG. 4A, performance monitoring component 460 may analyze inputs into model 430. This may include observing one or more of the features described above, such as prior usage of emergency services or medical services generally, medical diagnoses, lab results, vital signs, age, ethnicity, gender, stage of disease, prescribed medications, medication dosage, reported side effects, prescribed treatments, or any other relevant factors. Performance monitoring component 460 may perform a statistical analysis on these input features and detect anomalous values for these features.

In some embodiments, this may include tracking historical data for the input features and comparing current inputs to the historical data. This comparison may be performed in a variety of ways. In some embodiments, the statistical analysis may include keeping a running average (e.g., a simple moving average, an exponential moving average, a smoothed moving average, a linear weighted moving average, etc.) and comparing the current input to the moving average value. If the difference between the moving average value and the current input exceeds a threshold degree of variation, the input may be marked as anomalous. While a moving average is provided by way of example, it is understood that various other statistical analyses may be used.
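The moving-average comparison can be sketched as follows, using a simple moving average over the most recent values. The window size, the allowed deviation, and the sample readings are illustrative assumptions rather than values from the disclosure.

```python
def is_anomalous(history, current, window=5, max_deviation=10.0):
    """Flag the current input if it differs from the simple moving average
    of the last `window` historical values by more than max_deviation."""
    recent = history[-window:]
    if not recent:
        return False  # no history yet, so nothing to compare against
    moving_average = sum(recent) / len(recent)
    return abs(current - moving_average) > max_deviation


# Hypothetical history for one input feature (e.g., a vital sign reading).
history = [70, 72, 71, 69, 73]
flag_high = is_anomalous(history, 120)
flag_normal = is_anomalous(history, 75)
```

An exponential or weighted moving average could be substituted by changing only how moving_average is computed.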

Alternatively, or in addition to detecting anomalous input values, performance monitoring 460 may analyze other potential causes of decreased performance of the model. In some embodiments, performance monitoring component 460 may detect stale data feeds, for example, by tracking a number of novel encounters for patients. This may also include determining whether selected subsets of data (e.g., tables) have become stale, for example, by tracking incremental row counts in source data feeds across tables. The performance may also be based on detecting lag with particular data fields, detecting inaccuracies in predictions, detecting bias in predictions, etc.
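As a hedged example of tracking incremental row counts, a table might be treated as stale when its most recent load added fewer rows than some minimum. The table names, counts, and the min_new_rows parameter below are hypothetical.

```python
def stale_tables(row_counts, min_new_rows=1):
    """Return the names of tables whose latest load added fewer than
    min_new_rows rows.

    row_counts maps a table name to its cumulative row counts over
    successive loads, oldest first.
    """
    stale = []
    for table, counts in row_counts.items():
        if len(counts) >= 2 and counts[-1] - counts[-2] < min_new_rows:
            stale.append(table)
    return sorted(stale)


# Hypothetical cumulative row counts observed over three loads.
row_counts = {"visits": [100, 120, 120], "labs": [50, 60, 75]}
stale = stale_tables(row_counts)
```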

In some embodiments, performance monitoring 460 may include functionality for identifying and quantifying biases that may be trained into trained model 430. Based on this identification and quantification, an administrator or other user may correct trained model 430 to avoid generating biased predictions. Such biases may develop during the training process for trained model 430, as described above with respect to FIG. 4A. For example, in some cases, training data set 410 may inadvertently reflect inherent social biases associated with medical care, such as unequal access to medical care among ethnic or other groups. Because trained model 430 is developed based on this data, it may “learn” this inequality and may include biases in predicting medical care usage among different races or other social groups.

To avoid perpetuating biases from the training data set 410 into risk predictions 452 and/or preemptive care recommendations 454, performance monitoring system 460 may analyze output 450 to quantify potential biases among various groups. FIG. 4B illustrates example bias monitoring 480 that may be performed using performance monitoring system 460 consistent with the disclosed embodiments. This monitoring may include determining a calibration factor for a plurality of groups of patients. For example, as shown in FIG. 4B, calibration factors may be determined for a plurality of ethnic groups (e.g., Asian, Black, Hispanic, White, Other, etc.). The calibration factor may be defined as CF = (Avg. ER Use) − (Avg. Predicted Risk) for each group of interest, where CF is the calibration factor, which represents the average usage of a particular medical service (e.g., hospital or acute medical services) minus the average predicted risk of use of that medical service within the group. These values may be determined retrospectively to verify the accuracy of model 430.
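The calibration factor defined above can be computed per group as in the sketch below, where actual usage is coded as 1 or 0 per patient. The function name and the sample values are assumptions for illustration.

```python
def calibration_factor(actual_usage, predicted_risk):
    """CF = (average actual usage) - (average predicted risk) for one group.

    actual_usage holds 1 (service used) or 0 (not used) per patient;
    predicted_risk holds the model's predicted probability per patient.
    """
    avg_usage = sum(actual_usage) / len(actual_usage)
    avg_risk = sum(predicted_risk) / len(predicted_risk)
    return avg_usage - avg_risk


# Hypothetical group of four patients: two used the service.
cf = calibration_factor([1, 0, 0, 1], [0.2, 0.3, 0.1, 0.4])
```

Here the average usage is 0.5 and the average predicted risk is 0.25, so the calibration factor is about 0.25, meaning actual usage was higher than the model predicted for this hypothetical group.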

Ideally, the calibration factor (CF) would be zero for any given group, indicating the model is accurately predicting risk for the group. A non-zero calibration factor may indicate that the model is inaccurately predicting risk for the group, which may indicate a bias in the model, especially when the calibration factor for one group varies as compared to other groups. A calibration factor greater than zero may indicate the model is systematically underpredicting risk for a particular group, whereas a calibration factor less than zero may indicate the model is systematically overpredicting risk for the group. For example, calibration factor 482 may indicate an overprediction, whereas calibration factor 484 may indicate an underprediction. In some embodiments, performance monitoring system 460 may also determine other statistical values associated with the calibration factors, such as confidence intervals 492 and 494. For example, confidence intervals 492 and 494 may represent a range in which the model has a confidence level of 95% (or various other confidence levels, such as 99%, 90%, etc.) that the true calibration factor is within the range. The calibration factors and/or confidence intervals may be determined using stratified sampling or any other suitable sampling methods.
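One way a confidence interval for a group's calibration factor might be estimated is a percentile bootstrap, sketched below. This is an assumed substitute chosen for simplicity, not the stratified sampling mentioned above, and the resample count, seed, and sample data are hypothetical.

```python
import random


def bootstrap_ci(actual, predicted, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for CF = mean(actual) - mean(predicted),
    resampling patients (paired actual/predicted values) with replacement."""
    rng = random.Random(seed)
    n = len(actual)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(sum(actual[i] - predicted[i] for i in idx) / n)
    estimates.sort()
    tail = (1.0 - confidence) / 2.0
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1.0 - tail) * n_resamples))]
    return lower, upper


# Hypothetical group of eight patients, each with a predicted risk of 0.2.
lower, upper = bootstrap_ci([1, 0, 0, 1, 1, 0, 1, 0], [0.2] * 8)
```

If the resulting interval excludes zero, that may suggest the over- or underprediction for the group is systematic rather than sampling noise.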

Based on the determined calibration factors, biases in the model may be identified and addressed. For example, an administrator may recognize a systematic bias in underpredicting risk associated with calibration factor 484 and may take necessary remedial action. For example, this may include re-training the model, applying a correction factor to one or more variables within the model, flagging the predictions associated with this group, generating an alert (e.g., to client devices 110, provider system 210, etc.) or various other control actions that may be performed. While FIG. 4B illustrates calibration factors associated with racial biases, performance monitoring system 460 may similarly monitor for biases based on other groups, including gender, sexual orientation, ethnicity, religion, social class, or various other groups.

In some embodiments, bias can be avoided or minimized by monitoring training data set 410 and medical records 432 and appropriately selecting data and medical records for inclusion into training data set 410. For example, bias may be reduced by choosing medical records associated with patients where the included outcomes are more direct proxies for risk and better capture patients' health, or by using a sub-cohort of a given patient group that has a high degree of data completeness. In further embodiments, a variable may be introduced into trained model 430 for a given patient group, while examining the coefficients associated with that group.

In some embodiments, performance monitoring can be achieved by examining the risk predictions against observed outcomes to track the accuracy of the predictions.

FIG. 5 is an illustration of an example report 500 identifying patients with heightened risk of a near-term use of particular medical services consistent with the disclosed embodiments. For example, report 500 may include a group of patients (identified as Patient 1, Patient 2, Patient 3, Patient 4, and Patient 5) having a likelihood of a use of near-term hospital-based medical services exceeding a particular threshold. Report 500 may be associated with a particular time period. For example, report 500 may include any patients having a risk of use of hospital-based medical services within a predetermined window (e.g., the next week, the next 30 days, the next 60 days, the next 90 days, etc.). This predetermined time window may be a default value, may be selected by a medical practitioner (e.g., through client devices 110), or may be defined in various other ways. In some embodiments, report 500 may include patients having recent encounters (e.g., recent medical visits, recent updates to their medical records, etc.). For example, this may include patients having an encounter in the last day, the last 3 days, the past week, past month, etc. In some embodiments, the patients in report 500 may be defined by provider system 210. For example, a query may define particular patients, a particular category of patients (e.g., including a particular feature), or the like.

Report 500 may include demographic information, such as the patient's gender or date of birth, as shown in FIG. 5. Various other demographic data may be included in report 500, such as the patient's race/ethnicity, sexual orientation, an encounter date, a patient identifier, whether they are enrolled in Medicare, or various other information. In some embodiments, report 500 may be customizable such that a user may define which information is included.

Report 500 may also include information identifying or describing the basis for the predicted likelihood determination. For example, as shown in FIG. 5, report 500 may include a number of risk predictors and indicators (e.g., indicator 501) of whether the particular risk factor applies to each patient. In the example shown in FIG. 5, this may include vital signs associated with the patient, lab results, medical orders within a certain time period, patient visits, or diagnosis information. This may represent a predetermined number of the most relevant features to the trained model, as described above. In some embodiments, report 500 may also include an overall risk assessment indicating the patient's overall risk of near-term medical care. For example, report 500 may include an index, number, or numerical range indicative of a likelihood that an individual patient may make near-term use of hospital-based or other medical services. Although not shown in FIG. 5, report 500 may also provide an estimate of a time, time range, etc. within which a particular patient may be predicted to make use of such near-term medical services. In some embodiments, a similar report may be generated to identify patients with heightened risk of a near-term clinical outcome.

FIG. 6 is a flowchart showing an example process 600 for predicting health care services or clinical outcomes, consistent with the disclosed embodiments. Process 600 may be performed by at least one processing device, such as processing engine 131, as described above. It is to be understood that throughout the present disclosure, the term “processor” is used as a shorthand for “at least one processor.” In other words, a processor may include one or more structures that perform logic operations whether such structures are collocated, connected, or dispersed. In some embodiments, a non-transitory computer readable medium may contain instructions that when executed by a processor cause the processor to perform process 600. Further, process 600 is not necessarily limited to the steps shown in FIG. 6, and any steps or processes of the various embodiments described throughout the present disclosure may also be included in process 600, including those described above with respect to FIGS. 1-5.

In step 610, process 600 may include accessing a database storing a medical record associated with a patient. System 130 may access patient medical records from local database 132 or from an external data source, such as data sources 120. For example, medical record 212 may be provided by a provider system 210. The medical record may comprise one or more electronic files, such as text files, image files, PDF files, XML files, YAML files, or the like. The medical records may include structured data (e.g., structured data 310) and/or unstructured data (e.g., unstructured data 320), as described above.

In step 620, process 600 may include analyzing the medical record to identify a characteristic of the patient. The characteristic of the patient may include any characteristic represented in the unstructured or structured data in the medical record. For example, the characteristic may include prior use of medical services by the patient, an indication of a medical diagnosis for the patient, a laboratory and/or diagnostic test result for the patient, a vital sign for the patient, or various other characteristics (e.g., features) described above. In some embodiments, the medical record may comprise structured data associated with the patient, as described above. Accordingly, analyzing the medical record may comprise analyzing the structured data. Similarly, the medical record may comprise unstructured data associated with the patient, and analyzing the medical record may comprise analyzing the unstructured data.

In step 630, process 600 may include determining, based on the patient characteristic and using a trained machine learning model, a patient risk level indicating a likelihood that the patient will require medical services within a predetermined time period or will experience a specific clinical outcome within a predetermined time period. For example, trained model 430 may be used to generate risk predictions 452, as described above with respect to FIG. 4A. The predetermined time period may be any suitable period during which the use of medical services may be predicted based on medical record data. For example, the predetermined time period may be less than one week, less than one month, less than three months, or any other time period, as described above. In some embodiments, the predetermined time period may be defined as a default value. In other embodiments, the predetermined period may be a user-defined period (e.g., defined by a healthcare provider, administrator or other user, etc.). In some embodiments, the machine learning model may be trained based on clinical factors weighted based on logistic regression. For example, a logistic regression or other weighting algorithm may be used to define a relative ranking for features input into training algorithm 420. As described above, the factors may be weighted based on input received through a survey, a focus group, a training process, or the like.

In step 640, process 600 may include comparing the patient risk level to a predetermined risk threshold. The predetermined risk threshold may be any value (e.g., a number, percentage, binary value, etc.) defining which patients should be included in a report. Accordingly, in step 650, process 600 may include generating, based on the comparison, a report indicating a recommended intervention for the patient. For example, system 130 may generate report 500 as an output of trained model 430, as described above. In some embodiments, process 600 may further include transmitting the report to a healthcare entity. For example, the report may be transmitted to provider system 210, as described above. The report may be configured to be displayed on a client device of the provider system, such as client devices 110. The recommended intervention for the patient may include any event associated with the care of the patient, as described above. For example, the recommended intervention may include a recommended treatment, patient evaluation, in-home care visit, prescription or refill, check-in call, or the like. The recommendation may be an explicit recommendation for an intervention (e.g., “Recommendation: perform in-home visit for “Patient 3” next Tuesday at 3:00 PM”) or may be an implicit recommendation (e.g., by virtue of the patient being included in the report).
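
As a non-limiting illustration, the comparison of step 640 and the report generation of step 650 may be sketched as follows. The `ReportEntry` type, the `build_report` function, and the default recommendation text are hypothetical names introduced for this example only. The use of "greater than or equal to" as the comparison is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class ReportEntry:
    patient_id: str
    risk_level: float
    recommendation: str


def build_report(
    risk_levels: dict[str, float],
    threshold: float,
    recommendation: str = "in-home care visit",
) -> list[ReportEntry]:
    """Include a patient in the report when the risk level meets the threshold."""
    return [
        ReportEntry(patient_id, risk, recommendation)
        for patient_id, risk in risk_levels.items()
        if risk >= threshold
    ]


report = build_report(
    {"Patient 1": 0.12, "Patient 2": 0.40, "Patient 3": 0.75}, threshold=0.30
)
```

Here each entry carries an explicit recommendation, while inclusion in the returned list may also serve as an implicit recommendation.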

In step 660, process 600 may include determining a calibration factor indicating a difference between an average patient risk level and an average actual healthcare service usage for a first group of the plurality of patients. For example, the calibration factor may correspond to calibration factor 482 as described above with respect to FIG. 4B. The first group of patients may share a common trait or characteristic. In some embodiments, the common characteristic may be a common demographic classification, as described above. In some embodiments, step 660 may further include determining a confidence interval representing a range of values for the calibration factor associated with a particular degree of confidence. For example, the confidence interval may correspond to confidence interval 492 described above with respect to FIG. 4B.
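
As a non-limiting illustration, a calibration factor and an associated confidence interval for one group of patients may be sketched as follows. The sign convention (actual usage minus predicted risk) and the normal-approximation interval with z = 1.96 (roughly 95% confidence) are assumptions made for this example. Other conventions or interval estimators may be used.

```python
import math
import statistics


def calibration_factor(
    risks: list[float], outcomes: list[int], z: float = 1.96
) -> tuple[float, tuple[float, float]]:
    """Return (factor, (low, high)) for one group of patients.

    risks    -- predicted risk level per patient, each in [0, 1]
    outcomes -- 1 if the patient actually used the service, else 0
    """
    if len(risks) != len(outcomes) or len(risks) < 2:
        raise ValueError("need two or more paired risks and outcomes")
    # The mean of per-patient differences equals average usage minus average risk.
    diffs = [outcome - risk for risk, outcome in zip(risks, outcomes)]
    factor = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return factor, (factor - z * std_err, factor + z * std_err)


factor, (low, high) = calibration_factor([0.2, 0.4, 0.6, 0.8], [0, 1, 1, 1])
```

Under this convention, a positive factor suggests the model under-predicted service usage for the group, and a negative factor suggests it over-predicted.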

In step 670, process 600 may include determining, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients. The second group of patients may have at least one trait or characteristic different from the first group. For example, the first group may comprise patients having a first ethnicity and the second group may comprise patients having a second ethnicity, as discussed above. In some embodiments, process 600 may perform additional actions based on detecting the bias. For example, process 600 may further include generating a report indicating the bias.

As described above, the system may further be configured to generate a schedule for patient visits or other recommended interventions. Accordingly, process 600 may further include scheduling at least one of a preemptive treatment, intervention or a visit for the patient based on the comparison with the threshold. In some embodiments, the recommended intervention may be intended to prevent or negate the need for the medical service. For example, if a patient is expected to require an emergency room visit for a particular adverse event, the recommended intervention may target conditions related to the adverse event to prevent the adverse event from occurring. In some embodiments, process 600 may further include generating, based on the patient risk level, a priority level for the patient, as described above. Accordingly, the recommended intervention may be scheduled based on the priority level. In some embodiments, generating the report may comprise including the patient in a list of a plurality of patients to receive an intervention within the predetermined time period. In such embodiments, the list may be organized based on at least one of predicted patient risk levels for the plurality of patients or predicted timeframes for medical services for the plurality of patients.
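
As a non-limiting illustration, generating a priority level from the patient risk level and organizing a list of patients by predicted risk may be sketched as follows. The priority bands (0.7 and 0.4) and the function names are hypothetical and are used only for this example.

```python
def priority_level(risk: float) -> str:
    """Map a risk level to a priority band (band edges are illustrative)."""
    if risk >= 0.7:
        return "high"
    if risk >= 0.4:
        return "medium"
    return "low"


def intervention_list(
    risk_levels: dict[str, float], threshold: float
) -> list[tuple[str, str]]:
    """Return (patient, priority) pairs for flagged patients, highest risk first."""
    flagged = [(pid, risk) for pid, risk in risk_levels.items() if risk >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [(pid, priority_level(risk)) for pid, risk in flagged]


ordered = intervention_list(
    {"Patient 1": 0.45, "Patient 2": 0.20, "Patient 3": 0.90}, threshold=0.40
)
```

A scheduler may then assign earlier intervention slots to patients nearer the top of the list.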

In some embodiments, the report may be generated for a plurality of patients. Accordingly, process 600 may further include generating reports indicating recommended interventions for a plurality of patients and scheduling, based on the reports, recommended interventions for the plurality of patients within the predetermined time period.

FIG. 7 is a flowchart showing an example process 700 for evaluating a machine learning model for potential bias, consistent with the disclosed embodiments. Process 700 may be performed by at least one processing device, such as processing engine 131, as described above. Process 700 may be associated with the process performed by performance monitoring 460, as described above. In some embodiments, a non-transitory computer readable medium may contain instructions that when executed by a processor cause the processor to perform process 700. Further, process 700 is not necessarily limited to the steps shown in FIG. 7, and any steps or processes of the various embodiments described throughout the present disclosure may also be included in process 700, including those described above with respect to FIGS. 1-6.

In step 710, process 700 may include receiving a plurality of outputs from a machine learning model. The outputs may comprise predictions for a plurality of patients based on medical records associated with the plurality of patients. For example, the machine learning model may correspond to trained model 220 (or trained model 430) and accordingly, the outputs may correspond to output 224 (or output 450) as described above.

In step 720, process 700 may include accessing a plurality of actual outcomes associated with the plurality of patients. The actual outcomes may indicate whether the predictions for the plurality of patients included in the outputs were correct. For example, if the output includes a prediction of a risk of a patient requiring a particular medical service (such as risk prediction 452), the actual outcome may include an indication of whether the patient received the particular medical service. As another example, the output may include a predicted clinical outcome for a patient, and the actual outcome may indicate whether the patient experienced the predicted clinical outcome. Accordingly, the actual outcome data may be collected after the output has been generated. In some embodiments, this may include accessing updated medical records for the plurality of patients. In some embodiments, the actual outcomes may be included in a training data set, such as training data set 410. Accordingly, process 700 may be performed during a training phase for a trained model.

In step 730, process 700 may include determining a calibration factor indicating a difference between the predictions and the actual outcomes for a first group of the plurality of patients. For example, step 730 may include determining calibration factor 482 as described above with respect to FIG. 4B. Accordingly, the calibration factor may represent the average usage of a particular medical service (e.g., hospital or acute medical services) minus the average predicted risk of the particular medical service within the first group. The first group of patients may be defined based on a variety of characteristics. In some embodiments, the first group may include patients of a particular ethnic group (e.g., Asian, Black, Hispanic, White, Other, etc.). The first group may be determined based on other factors, such as gender, sexual orientation, ethnicity, religion, social class, or various other patient groups. In some embodiments, process 700 may include determining other statistical values associated with the calibration factor. For example, process 700 may further include determining a confidence interval representing a range of values for the calibration factor associated with a particular degree of confidence, such as confidence interval 492.

In step 740, process 700 may include detecting, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients. As noted above, the first and second groups may be selected based on various social factors that may reflect a bias in the model. For example, the first group may comprise patients having a first ethnicity and the second group may comprise patients having a second ethnicity. Accordingly, the bias may reflect a bias based on ethnicity of patients inherent in the training data. This bias may be detected in various ways. In some embodiments, the bias may be detected based on a comparison of calibration factors between multiple groups. For example, process 700 may include determining an additional calibration factor indicating a difference between the predictions and the actual outcomes for the second group of the plurality of patients, such as calibration factor 484 as described above with respect to FIG. 4B. Accordingly, detecting the bias may comprise comparing the calibration factor to the additional calibration factor.
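
As a non-limiting illustration, comparing calibration factors between two groups to detect a potential bias may be sketched as follows. The record layout, the function names, and the tolerance of 0.05 are hypothetical and serve only as an example. A statistical test or a comparison of confidence intervals may be used instead of a fixed tolerance.

```python
from collections import defaultdict
from typing import Iterable


def group_calibration_factors(
    records: Iterable[tuple[str, float, int]]
) -> dict[str, float]:
    """Compute average actual usage minus average predicted risk per group.

    Each record is (group_label, predicted_risk, actual_outcome), where
    actual_outcome is 1 if the service was used and 0 otherwise.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # risk sum, outcome sum, count
    for group, risk, outcome in records:
        entry = totals[group]
        entry[0] += risk
        entry[1] += outcome
        entry[2] += 1
    return {g: (o_sum - r_sum) / n for g, (r_sum, o_sum, n) in totals.items()}


def detect_bias(
    factors: dict[str, float], first: str, second: str, tolerance: float = 0.05
) -> bool:
    """Flag a potential bias when the two calibration factors differ by more
    than the (illustrative) tolerance."""
    return abs(factors[first] - factors[second]) > tolerance


records = [("A", 0.2, 0), ("A", 0.4, 1), ("B", 0.5, 0), ("B", 0.5, 0)]
factors = group_calibration_factors(records)
biased = detect_bias(factors, "A", "B")
```

In this example the model under-predicts usage for group A and over-predicts it for group B, so the gap between the two factors exceeds the tolerance and a potential bias is flagged.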

In some embodiments, process 700 may include additional actions taken based on detecting the bias. For example, process 700 may include generating a report indicating the bias. The report may be generated and transmitted to a user or other administrator, a healthcare provider, or the like. In some embodiments, process 700 may include providing a recommendation for at least one of: re-training the machine learning model, applying a correction factor, or flagging the plurality of outputs, as described above with respect to FIG. 4B.

The foregoing description has been presented for purposes of illustration. It is not exhaustive and is not limited to the precise forms or embodiments disclosed. Modifications and adaptations will be apparent to those skilled in the art from consideration of the specification and practice of the disclosed embodiments. Additionally, although aspects of the disclosed embodiments are described as being stored in memory, one skilled in the art will appreciate that these aspects can also be stored on other types of computer readable media, such as secondary storage devices, for example, hard disks or CD ROM, or other forms of RAM or ROM, USB media, DVD, Blu-ray, 4K Ultra HD Blu-ray, or other optical drive media.

Computer programs based on the written description and disclosed methods are within the skill of an experienced developer. The various programs or program modules can be created using any of the techniques known to one skilled in the art or can be designed in connection with existing software. For example, program sections or program modules can be designed in or by means of .Net Framework, .Net Compact Framework (and related languages, such as Visual Basic, C, etc.), Java, Python, R, C++, Objective-C, HTML, HTML/AJAX combinations, XML, or HTML with included Java applets.

Moreover, while illustrative embodiments have been described herein, the scope includes any and all embodiments having equivalent elements, modifications, omissions, combinations (e.g., of aspects across various embodiments), adaptations and/or alterations as would be appreciated by those skilled in the art based on the present disclosure. The limitations in the claims are to be interpreted broadly based on the language employed in the claims and not limited to examples described in the present specification or during the prosecution of the application. The examples are to be construed as non-exclusive. Furthermore, the steps of the disclosed methods may be modified in any manner, including by reordering steps and/or inserting or deleting steps. It is intended, therefore, that the specification and examples be considered as illustrative only, with a true scope and spirit being indicated by the following claims and their full scope of equivalents.

1. A model-assisted system for predicting health care services, the system comprising: at least one processor programmed to: access a database storing medical records associated with a plurality of patients; analyze a medical record associated with a patient of the plurality of patients to identify a characteristic of the patient; determine, based on the patient characteristic and using a trained machine learning model, a patient risk level indicating a likelihood that the patient will require a health care service within a predetermined time period, the machine learning model being trained based on clinical factors weighted based on logistic regression; compare the patient risk level to a predetermined risk threshold; and generate, based on the comparison, a report indicating a recommended intervention for the patient; determine a calibration factor indicating a difference between an average patient risk level and an average actual healthcare service usage for a first group of the plurality of patients; and determine, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients.
2. The system of claim 1, wherein the medical record comprises structured data associated with the patient and analyzing the medical record comprises analyzing the structured data.
3. The system of claim 1, wherein the medical record comprises unstructured data associated with the patient and analyzing the medical record comprises analyzing the unstructured data.
4. The system of claim 1, wherein the patient characteristic comprises prior use of medical services by the patient.
5. The system of claim 1, wherein the patient characteristic comprises an indication of a medical diagnosis for the patient.
6. The system of claim 1, wherein the patient characteristic comprises at least one of a laboratory or diagnostic test result for the patient.
7. The system of claim 1, wherein the at least one processor is further configured to schedule the recommended intervention for the patient.
8. The system of claim 10, wherein the at least one processor is further configured to generate, based on the patient risk level, a priority level for the patient, and wherein the recommended intervention is scheduled based on the priority level.
9. The system of claim 1, wherein the at least one processor is further configured to: generate reports indicating recommended interventions for a plurality of patients; and schedule, based on the reports, the recommended interventions for the plurality of patients within the predetermined time period.
10. The system of claim 1, wherein the at least one processor is further configured to generate a report indicating the bias.
11. The system of claim 1, wherein the first group comprises patients having a first ethnicity and the second group comprises patients having a second ethnicity.
12. The system of claim 1, wherein the at least one processor is further configured to determine a confidence interval representing a range of values for the calibration factor associated with a particular degree of confidence.
13. A computer-assisted method for predicting health care services, the method comprising: accessing a database storing a medical record associated with a patient; analyzing the medical record to identify a characteristic associated with the patient; determining, based on the patient characteristic and using a trained machine learning model, a patient risk level indicating a likelihood that the patient will require a health care service within a predetermined time period, the machine learning model being trained based on clinical factors weighted based on logistic regression; comparing the patient risk level to a predetermined risk threshold; and generating, based on the comparison, a report indicating a recommended intervention for the patient; determining a calibration factor indicating a difference between an average patient risk level and an average actual healthcare service usage for a first group of the plurality of patients; and determining, based on the calibration factor, a bias associated with the first group relative to a second group of the plurality of patients.
14. The method of claim 13, wherein the medical record comprises structured data associated with the patient and analyzing the medical record comprises analyzing the structured data.
15. The method of claim 13, wherein the medical record comprises unstructured data associated with the patient and analyzing the medical record comprises analyzing the unstructured data.
16. The method of claim 13, wherein the patient characteristic comprises prior use of medical services by the patient.
17. The method of claim 13, wherein the patient characteristic comprises an indication of a medical diagnosis for the patient.
18. The method of claim 13, wherein the patient characteristic comprises at least one of a laboratory or diagnostic test result for the patient.
19. The method of claim 13, wherein the method further comprises scheduling the recommended intervention for the patient.
20. The method of claim 19, wherein the method further comprises generating, based on the patient risk level, a priority level for the patient, and wherein the recommended intervention is scheduled based on the priority level.
21-27. (canceled)